Dataset statistics
| Number of variables | 21 |
|---|---|
| Number of observations | 5507751 |
| Missing cells | 39301223 |
| Missing cells (%) | 34.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 882.4 MiB |
| Average record size in memory | 168.0 B |
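The percentages in this table follow directly from the raw counts. The sketch below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 5_507_751
missing_cells = 39_301_223
total_size_mib = 882.4

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the reported values (34.0% and 168.0 B).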
Variable types
| Numeric | 4 |
|---|---|
| DateTime | 2 |
| Text | 3 |
| Categorical | 12 |
Alerts
| Alert | Type |
|---|---|
| PERSON_TYPE is highly imbalanced (85.9%) | Imbalance |
| PERSON_INJURY is highly imbalanced (65.7%) | Imbalance |
| EJECTION is highly imbalanced (93.3%) | Imbalance |
| EMOTIONAL_STATUS is highly imbalanced (74.4%) | Imbalance |
| BODILY_INJURY is highly imbalanced (68.6%) | Imbalance |
| POSITION_IN_VEHICLE is highly imbalanced (52.8%) | Imbalance |
| SAFETY_EQUIPMENT is highly imbalanced (60.2%) | Imbalance |
| COMPLAINT is highly imbalanced (75.2%) | Imbalance |
| VEHICLE_ID has 224838 (4.1%) missing values | Missing |
| PERSON_AGE has 599276 (10.9%) missing values | Missing |
| EJECTION has 2679259 (48.6%) missing values | Missing |
| EMOTIONAL_STATUS has 2592938 (47.1%) missing values | Missing |
| BODILY_INJURY has 2592895 (47.1%) missing values | Missing |
| POSITION_IN_VEHICLE has 2678865 (48.6%) missing values | Missing |
| SAFETY_EQUIPMENT has 2861973 (52.0%) missing values | Missing |
| PED_LOCATION has 5416437 (98.3%) missing values | Missing |
| PED_ACTION has 5416538 (98.3%) missing values | Missing |
| COMPLAINT has 2592888 (47.1%) missing values | Missing |
| PED_ROLE has 194889 (3.5%) missing values | Missing |
| CONTRIBUTING_FACTOR_1 has 5417789 (98.4%) missing values | Missing |
| CONTRIBUTING_FACTOR_2 has 5417906 (98.4%) missing values | Missing |
| PERSON_SEX has 614713 (11.2%) missing values | Missing |
| PERSON_AGE is highly skewed (γ1 = 72.22093394) | Skewed |
| UNIQUE_ID has unique values | Unique |
| PERSON_AGE has 547401 (9.9%) zeros | Zeros |
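The report does not state how the imbalance percentages are computed. The reported values for PERSON_TYPE and PERSON_INJURY are consistent with one minus the normalized Shannon entropy of the class counts, so the sketch below assumes that definition. `imbalance_score` is a hypothetical helper written for illustration, not a function of the profiling library; the counts are copied from the Common Values tables of the two variables.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    This definition is an assumption that matches the reported figures.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts taken from the report.
person_type = [5_294_767, 131_361, 71_227, 10_396]
person_injury = [4_829_123, 675_365, 3_263]

print(f"PERSON_TYPE:   {imbalance_score(person_type):.3f}")
print(f"PERSON_INJURY: {imbalance_score(person_injury):.3f}")
```

Under this assumption the two scores come out close to the reported 85.9% and 65.7%.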
Reproduction
| Analysis started | 2024-10-29 14:05:50.687965 |
|---|---|
| Analysis finished | 2024-10-29 14:08:29.697496 |
| Duration | 2 minutes and 39.01 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
UNIQUE_ID
Real number (ℝ)
UNIQUE 
| Distinct | 5507751 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9148413.7 |
| Minimum | 10922 |
|---|---|
| Maximum | 13187099 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.0 MiB |
Quantile statistics
| Minimum | 10922 |
|---|---|
| 5-th percentile | 5811028.5 |
| Q1 | 7019331.5 |
| median | 9412847 |
| Q3 | 11508674 |
| 95-th percentile | 12833888 |
| Maximum | 13187099 |
| Range | 13176177 |
| Interquartile range (IQR) | 4489342 |
Descriptive statistics
| Standard deviation | 2665919.8 |
|---|---|
| Coefficient of variation (CV) | 0.29140788 |
| Kurtosis | -0.13772565 |
| Mean | 9148413.7 |
| Median Absolute Deviation (MAD) | 2293893 |
| Skewness | -0.47583741 |
| Sum | 5.0387185 × 10^13 |
| Variance | 7.1071284 × 10^12 |
| Monotonicity | Not monotonic |
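Several of the figures above are derived from others in the same tables. As a quick consistency check, the sketch below recomputes the range, interquartile range, coefficient of variation, and variance for UNIQUE_ID from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are rounded as printed in the report, so small differences in the last digits are expected.

```python
# Values copied from the UNIQUE_ID tables.
minimum, maximum = 10_922, 13_187_099
q1, q3 = 7_019_331.5, 11_508_674
mean, std = 9_148_413.7, 2_665_919.8

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"range:    {value_range}")
print(f"IQR:      {iqr}")
print(f"CV:       {cv:.8f}")
print(f"variance: {variance:.7e}")
```

The computed IQR is 4489342.5, which the report displays as 4489342.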
| Value | Count | Frequency (%) |
| 10249006 | 1 | < 0.1% |
| 11357975 | 1 | < 0.1% |
| 11357395 | 1 | < 0.1% |
| 11357768 | 1 | < 0.1% |
| 11356878 | 1 | < 0.1% |
| 11359067 | 1 | < 0.1% |
| 11358163 | 1 | < 0.1% |
| 11357862 | 1 | < 0.1% |
| 11357054 | 1 | < 0.1% |
| 11356155 | 1 | < 0.1% |
| Other values (5507741) | 5507741 |
| Value | Count | Frequency (%) |
| 10922 | 1 | |
| 79660 | 1 | |
| 79953 | 1 | |
| 79954 | 1 | |
| 81004 | 1 | |
| 81073 | 1 | |
| 81886 | 1 | |
| 82012 | 1 | |
| 82146 | 1 | |
| 82227 | 1 |
| Value | Count | Frequency (%) |
| 13187099 | 1 | |
| 13187098 | 1 | |
| 13187097 | 1 | |
| 13187096 | 1 | |
| 13187095 | 1 | |
| 13187094 | 1 | |
| 13187093 | 1 | |
| 13187092 | 1 | |
| 13187091 | 1 | |
| 13187090 | 1 |
COLLISION_ID
Real number (ℝ)
| Distinct | 1499799 |
|---|---|
| Distinct (%) | 27.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3971172.3 |
| Minimum | 37 |
|---|---|
| Maximum | 4766163 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.0 MiB |
Quantile statistics
| Minimum | 37 |
|---|---|
| 5-th percentile | 3425161.5 |
| Q1 | 3687055 |
| median | 4021234 |
| Q3 | 4371898 |
| 95-th percentile | 4687011.5 |
| Maximum | 4766163 |
| Range | 4766126 |
| Interquartile range (IQR) | 684843 |
Descriptive statistics
| Standard deviation | 655148.48 |
|---|---|
| Coefficient of variation (CV) | 0.16497609 |
| Kurtosis | 17.403828 |
| Mean | 3971172.3 |
| Median Absolute Deviation (MAD) | 341649 |
| Skewness | -3.3930529 |
| Sum | 2.1872228 × 10^13 |
| Variance | 4.2921953 × 10^11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3963775 | 77 | < 0.1% |
| 4691158 | 71 | < 0.1% |
| 3591272 | 66 | < 0.1% |
| 3539636 | 65 | < 0.1% |
| 3504309 | 64 | < 0.1% |
| 3571716 | 62 | < 0.1% |
| 3904409 | 61 | < 0.1% |
| 3691734 | 61 | < 0.1% |
| 3449201 | 60 | < 0.1% |
| 4143411 | 60 | < 0.1% |
| Other values (1499789) | 5507104 |
| Value | Count | Frequency (%) |
| 37 | 1 | |
| 39 | 1 | |
| 40 | 1 | |
| 44 | 1 | |
| 52 | 1 | |
| 55 | 2 | |
| 78 | 1 | |
| 79 | 2 | |
| 104 | 1 | |
| 107 | 1 |
| Value | Count | Frequency (%) |
| 4766163 | 3 | < 0.1% |
| 4766160 | 2 | < 0.1% |
| 4766157 | 6 | |
| 4766156 | 2 | < 0.1% |
| 4766155 | 4 | < 0.1% |
| 4766154 | 3 | < 0.1% |
| 4766152 | 4 | < 0.1% |
| 4766151 | 3 | < 0.1% |
| 4766150 | 3 | < 0.1% |
| 4766148 | 10 |
CRASH_DATE
Date
| Distinct | 4497 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 42.0 MiB |
| Minimum | 2012-07-01 00:00:00 |
|---|---|
| Maximum | 2024-10-22 00:00:00 |
CRASH_TIME
Date
| Distinct | 1440 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 42.0 MiB |
| Minimum | 2024-10-29 00:00:00 |
|---|---|
| Maximum | 2024-10-29 23:59:00 |
PERSON_ID
Text
| Distinct | 5312928 |
|---|---|
| Distinct (%) | 96.5% |
| Missing | 19 |
| Missing (%) | < 0.1% |
| Memory size | 42.0 MiB |
Length
| Max length | 36 |
|---|---|
| Median length | 36 |
| Mean length | 30.558888 |
| Min length | 1 |
Characters and Unicode
| Total characters | 168310166 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 5312896 |
|---|---|
| Unique (%) | 96.5% |
Sample
| 1st row | 31aa2bc0-f545-444f-8cdb-f1cb5cf00b89 |
|---|---|
| 2nd row | 4629e500-a73e-48dc-b8fb-53124d124b80 |
| 3rd row | ae48c136-1383-45db-83f4-2a5eecfb7cff |
| 4th row | 2782525 |
| 5th row | e038e18f-40fb-4471-99cf-345eae36e064 |
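The sample rows show two formats in PERSON_ID: 36-character UUID-style strings and short all-digit values. This also explains why the minimum length is 1 while the median is 36. A minimal sketch for separating the two formats follows; `classify_person_id` and its category labels are hypothetical names chosen for illustration.

```python
import re

# Canonical lowercase UUID layout: 8-4-4-4-12 hexadecimal characters.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def classify_person_id(value):
    """Label a PERSON_ID as 'uuid', 'numeric', 'missing', or 'other'."""
    if value is None:
        return "missing"
    if UUID_RE.fullmatch(value):
        return "uuid"
    if value.isdigit():
        return "numeric"
    return "other"


# Sample values taken from the report, plus a missing value.
samples = ["31aa2bc0-f545-444f-8cdb-f1cb5cf00b89", "2782525", "1", None]
labels = [classify_person_id(s) for s in samples]
print(labels)
```

Counting the labels over the full column would show how much of the 3.5% non-unique share comes from short numeric identifiers such as "1" and "2".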
| Value | Count | Frequency (%) |
| 1 | 142787 | 2.6% |
| 2 | 31734 | 0.6% |
| 3 | 11543 | 0.2% |
| 4 | 4672 | 0.1% |
| 5 | 2005 | < 0.1% |
| 6 | 923 | < 0.1% |
| 7 | 448 | < 0.1% |
| 8 | 235 | < 0.1% |
| 9 | 149 | < 0.1% |
| 10 | 91 | < 0.1% |
| Other values (5312918) | 5313145 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 18092964 | 10.7% |
| 4 | 13474881 | 8.0% |
| 9 | 10087515 | 6.0% |
| 8 | 10076947 | 6.0% |
| a | 9611356 | 5.7% |
| b | 9611229 | 5.7% |
| 1 | 9367153 | 5.6% |
| 2 | 9234786 | 5.5% |
| 3 | 9021796 | 5.4% |
| 7 | 8954703 | 5.3% |
| Other values (7) | 60776836 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 97068925 | |
| Lowercase Letter | 53148277 | |
| Dash Punctuation | 18092964 | 10.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 13474881 | |
| 9 | 10087515 | |
| 8 | 10076947 | |
| 1 | 9367153 | |
| 2 | 9234786 | |
| 3 | 9021796 | |
| 7 | 8954703 | |
| 6 | 8954668 | |
| 0 | 8948906 | |
| 5 | 8947570 |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 9611356 | |
| b | 9611229 | |
| d | 8488286 | |
| f | 8480361 | |
| e | 8478542 | |
| c | 8478503 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 18092964 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 115161889 | |
| Latin | 53148277 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 18092964 | |
| 4 | 13474881 | |
| 9 | 10087515 | |
| 8 | 10076947 | |
| 1 | 9367153 | |
| 2 | 9234786 | |
| 3 | 9021796 | |
| 7 | 8954703 | |
| 6 | 8954668 | |
| 0 | 8948906 |
Latin
| Value | Count | Frequency (%) |
| a | 9611356 | |
| b | 9611229 | |
| d | 8488286 | |
| f | 8480361 | |
| e | 8478542 | |
| c | 8478503 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 168310166 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 18092964 | 10.7% |
| 4 | 13474881 | 8.0% |
| 9 | 10087515 | 6.0% |
| 8 | 10076947 | 6.0% |
| a | 9611356 | 5.7% |
| b | 9611229 | 5.7% |
| 1 | 9367153 | 5.6% |
| 2 | 9234786 | 5.5% |
| 3 | 9021796 | 5.4% |
| 7 | 8954703 | 5.3% |
| Other values (7) | 60776836 |
PERSON_TYPE
Categorical
IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 42.0 MiB |
| Occupant | |
|---|---|
| Pedestrian | 131361 |
| Bicyclist | 71227 |
| Other Motorized | 10396 |
Length
| Max length | 15 |
|---|---|
| Median length | 8 |
| Mean length | 8.0738452 |
| Min length | 8 |
Characters and Unicode
| Total characters | 44468729 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Occupant |
|---|---|
| 2nd row | Occupant |
| 3rd row | Occupant |
| 4th row | Occupant |
| 5th row | Occupant |
Common Values
| Value | Count | Frequency (%) |
| Occupant | 5294767 | |
| Pedestrian | 131361 | 2.4% |
| Bicyclist | 71227 | 1.3% |
| Other Motorized | 10396 | 0.2% |
Words
| Value | Count | Frequency (%) |
| occupant | 5294767 | |
| pedestrian | 131361 | 2.4% |
| bicyclist | 71227 | 1.3% |
| other | 10396 | 0.2% |
| motorized | 10396 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| c | 10731988 | |
| t | 5518147 | |
| a | 5426128 | |
| n | 5426128 | |
| O | 5305163 | |
| u | 5294767 | |
| p | 5294767 | |
| i | 284211 | 0.6% |
| e | 283514 | 0.6% |
| s | 202588 | 0.5% |
| Other values (11) | 701328 | 1.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 38940186 | |
| Uppercase Letter | 5518147 | 12.4% |
| Space Separator | 10396 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 10731988 | |
| t | 5518147 | |
| a | 5426128 | |
| n | 5426128 | |
| u | 5294767 | |
| p | 5294767 | |
| i | 284211 | 0.7% |
| e | 283514 | 0.7% |
| s | 202588 | 0.5% |
| r | 152153 | 0.4% |
| Other values (6) | 325795 | 0.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 5305163 | |
| P | 131361 | 2.4% |
| B | 71227 | 1.3% |
| M | 10396 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10396 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 44458333 | |
| Common | 10396 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| c | 10731988 | |
| t | 5518147 | |
| a | 5426128 | |
| n | 5426128 | |
| O | 5305163 | |
| u | 5294767 | |
| p | 5294767 | |
| i | 284211 | 0.6% |
| e | 283514 | 0.6% |
| s | 202588 | 0.5% |
| Other values (10) | 690932 | 1.6% |
Common
| Value | Count | Frequency (%) |
| (space) | 10396 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 44468729 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 10731988 | |
| t | 5518147 | |
| a | 5426128 | |
| n | 5426128 | |
| O | 5305163 | |
| u | 5294767 | |
| p | 5294767 | |
| i | 284211 | 0.6% |
| e | 283514 | 0.6% |
| s | 202588 | 0.5% |
| Other values (11) | 701328 | 1.6% |
PERSON_INJURY
Categorical
IMBALANCE 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 42.0 MiB |
| Unspecified | |
|---|---|
| Injured | |
| Killed | 3263 |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.506554 |
| Min length | 6 |
Characters and Unicode
| Total characters | 57867486 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unspecified |
|---|---|
| 2nd row | Unspecified |
| 3rd row | Unspecified |
| 4th row | Unspecified |
| 5th row | Unspecified |
Common Values
| Value | Count | Frequency (%) |
| Unspecified | 4829123 | |
| Injured | 675365 | 12.3% |
| Killed | 3263 | 0.1% |
Words
| Value | Count | Frequency (%) |
| unspecified | 4829123 | |
| injured | 675365 | 12.3% |
| killed | 3263 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 10336874 | |
| i | 9661509 | |
| d | 5507751 | |
| n | 5504488 | |
| U | 4829123 | |
| s | 4829123 | |
| p | 4829123 | |
| c | 4829123 | |
| f | 4829123 | |
| I | 675365 | 1.2% |
| Other values (5) | 2035884 | 3.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 52359735 | |
| Uppercase Letter | 5507751 | 9.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 10336874 | |
| i | 9661509 | |
| d | 5507751 | |
| n | 5504488 | |
| s | 4829123 | |
| p | 4829123 | |
| c | 4829123 | |
| f | 4829123 | |
| j | 675365 | 1.3% |
| u | 675365 | 1.3% |
| Other values (2) | 681891 | 1.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 4829123 | |
| I | 675365 | 12.3% |
| K | 3263 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 57867486 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 10336874 | |
| i | 9661509 | |
| d | 5507751 | |
| n | 5504488 | |
| U | 4829123 | |
| s | 4829123 | |
| p | 4829123 | |
| c | 4829123 | |
| f | 4829123 | |
| I | 675365 | 1.2% |
| Other values (5) | 2035884 | 3.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 57867486 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 10336874 | |
| i | 9661509 | |
| d | 5507751 | |
| n | 5504488 | |
| U | 4829123 | |
| s | 4829123 | |
| p | 4829123 | |
| c | 4829123 | |
| f | 4829123 | |
| I | 675365 | 1.2% |
| Other values (5) | 2035884 | 3.5% |
VEHICLE_ID
Real number (ℝ)
MISSING 
| Distinct | 2550256 |
|---|---|
| Distinct (%) | 48.3% |
| Missing | 224838 |
| Missing (%) | 4.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18582262 |
| Minimum | 123423 |
|---|---|
| Maximum | 20771082 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 42.0 MiB |
Quantile statistics
| Minimum | 123423 |
|---|---|
| 5-th percentile | 17037408 |
| Q1 | 17563631 |
| median | 18729960 |
| Q3 | 19800448 |
| 95-th percentile | 20567251 |
| Maximum | 20771082 |
| Range | 20647659 |
| Interquartile range (IQR) | 2236817 |
Descriptive statistics
| Standard deviation | 1580138.1 |
|---|---|
| Coefficient of variation (CV) | 0.085034754 |
| Kurtosis | 7.5897347 |
| Mean | 18582262 |
| Median Absolute Deviation (MAD) | 1125663 |
| Skewness | -1.8170387 |
| Sum | 9.8168472 × 10^13 |
| Variance | 2.4968363 × 10^12 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 18590796 | 71 | < 0.1% |
| 17075216 | 63 | < 0.1% |
| 17334601 | 63 | < 0.1% |
| 17364088 | 60 | < 0.1% |
| 18954743 | 58 | < 0.1% |
| 18968693 | 58 | < 0.1% |
| 17483298 | 58 | < 0.1% |
| 17826063 | 58 | < 0.1% |
| 17521817 | 57 | < 0.1% |
| 19106096 | 57 | < 0.1% |
| Other values (2550246) | 5282310 | |
| (Missing) | 224838 | 4.1% |
| Value | Count | Frequency (%) |
| 123423 | 1 | < 0.1% |
| 602947 | 2 | |
| 611686 | 1 | < 0.1% |
| 620307 | 1 | < 0.1% |
| 621082 | 2 | |
| 622848 | 3 | |
| 625915 | 1 | < 0.1% |
| 628019 | 1 | < 0.1% |
| 629935 | 1 | < 0.1% |
| 630993 | 3 |
| Value | Count | Frequency (%) |
| 20771082 | 1 | |
| 20771081 | 1 | |
| 20771079 | 2 | |
| 20771078 | 2 | |
| 20771077 | 2 | |
| 20771072 | 1 | |
| 20771071 | 1 | |
| 20771070 | 1 | |
| 20771069 | 1 | |
| 20771068 | 2 |
PERSON_AGE
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 896 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 599276 |
| Missing (%) | 10.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.336707 |
| Minimum | -999 |
|---|---|
| Maximum | 9999 |
| Zeros | 547401 |
| Zeros (%) | 9.9% |
| Negative | 1205 |
| Negative (%) | < 0.1% |
| Memory size | 42.0 MiB |
Quantile statistics
| Minimum | -999 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 24 |
| median | 36 |
| Q3 | 50 |
| 95-th percentile | 68 |
| Maximum | 9999 |
| Range | 10998 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 113.26323 |
|---|---|
| Coefficient of variation (CV) | 3.0335623 |
| Kurtosis | 5856.8109 |
| Mean | 37.336707 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 72.220934 |
| Sum | 1.832663 × 10^8 |
| Variance | 12828.559 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 547401 | 9.9% |
| 30 | 112254 | 2.0% |
| 29 | 111764 | 2.0% |
| 28 | 111399 | 2.0% |
| 27 | 110861 | 2.0% |
| 31 | 108187 | 2.0% |
| 26 | 107508 | 2.0% |
| 32 | 106776 | 1.9% |
| 33 | 103839 | 1.9% |
| 25 | 103456 | 1.9% |
| Other values (886) | 3385030 | |
| (Missing) | 599276 | 10.9% |
| Value | Count | Frequency (%) |
| -999 | 8 | |
| -997 | 2 | < 0.1% |
| -996 | 1 | < 0.1% |
| -992 | 2 | < 0.1% |
| -991 | 1 | < 0.1% |
| -990 | 3 | < 0.1% |
| -989 | 1 | < 0.1% |
| -987 | 1 | < 0.1% |
| -982 | 3 | < 0.1% |
| -980 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9999 | 417 | |
| 9262 | 1 | < 0.1% |
| 9232 | 1 | < 0.1% |
| 9211 | 1 | < 0.1% |
| 9191 | 1 | < 0.1% |
| 9151 | 1 | < 0.1% |
| 9131 | 1 | < 0.1% |
| 9122 | 1 | < 0.1% |
| 8051 | 1 | < 0.1% |
| 8041 | 1 | < 0.1% |
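The extreme values above (for example -999 and 9999) are not plausible ages and look like sentinel or data-entry values, which would also explain the very high skewness and kurtosis. One possible cleaning step is to treat anything outside a plausible range as missing before computing statistics. The sketch below assumes bounds of 0 to 120 years; those bounds and the `clean_age` helper are illustrative choices, not part of the report.

```python
from typing import Optional

# Assumed plausibility bounds; not taken from the report.
MIN_AGE = 0
MAX_AGE = 120


def clean_age(value: Optional[float]) -> Optional[float]:
    """Return the age unchanged if plausible, otherwise None (missing)."""
    if value is None or value != value:  # None, or NaN (NaN != NaN)
        return None
    if value < MIN_AGE or value > MAX_AGE:
        return None
    return value


ages = [36, -999, 9999, 0, None, 50]
cleaned = [clean_age(a) for a in ages]
print(cleaned)
```

Zeros are kept here because 0 can be a legitimate age. With 9.9% of rows at zero, it may be worth checking whether zero is also used as a placeholder for an unknown age.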
EJECTION
Categorical
IMBALANCE  MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2679259 |
| Missing (%) | 48.6% |
| Memory size | 42.0 MiB |
| Not Ejected | |
|---|---|
| Ejected | 26536 |
| Does Not Apply | 15891 |
| Partially Ejected | 11563 |
| Trapped | 1323 |
Length
| Max length | 17 |
|---|---|
| Median length | 11 |
| Mean length | 11.00122 |
| Min length | 7 |
Characters and Unicode
| Total characters | 31116863 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Not Ejected |
|---|---|
| 2nd row | Not Ejected |
| 3rd row | Not Ejected |
| 4th row | Not Ejected |
| 5th row | Not Ejected |
Common Values
| Value | Count | Frequency (%) |
| Not Ejected | 2772638 | |
| Ejected | 26536 | 0.5% |
| Does Not Apply | 15891 | 0.3% |
| Partially Ejected | 11563 | 0.2% |
| Trapped | 1323 | < 0.1% |
| Unknown | 541 | < 0.1% |
| (Missing) | 2679259 |
Words
| Value | Count | Frequency (%) |
| ejected | 2810737 | |
| not | 2788529 | |
| does | 15891 | 0.3% |
| apply | 15891 | 0.3% |
| partially | 11563 | 0.2% |
| trapped | 1323 | < 0.1% |
| unknown | 541 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 5638688 | |
| t | 5610829 | |
| (space) | 2815983 | |
| d | 2812060 | |
| E | 2810737 | |
| j | 2810737 | |
| c | 2810737 | |
| o | 2804961 | |
| N | 2788529 | |
| l | 39017 | 0.1% |
| Other values (14) | 174585 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 22656405 | |
| Uppercase Letter | 5644475 | 18.1% |
| Space Separator | 2815983 | 9.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 5638688 | |
| t | 5610829 | |
| d | 2812060 | |
| j | 2810737 | |
| c | 2810737 | |
| o | 2804961 | |
| l | 39017 | 0.2% |
| p | 34428 | 0.2% |
| y | 27454 | 0.1% |
| a | 24449 | 0.1% |
| Other values (6) | 43045 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 2810737 | |
| N | 2788529 | |
| A | 15891 | 0.3% |
| D | 15891 | 0.3% |
| P | 11563 | 0.2% |
| T | 1323 | < 0.1% |
| U | 541 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2815983 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 28300880 | |
| Common | 2815983 | 9.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 5638688 | |
| t | 5610829 | |
| d | 2812060 | |
| E | 2810737 | |
| j | 2810737 | |
| c | 2810737 | |
| o | 2804961 | |
| N | 2788529 | |
| l | 39017 | 0.1% |
| p | 34428 | 0.1% |
| Other values (13) | 140157 | 0.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 2815983 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 31116863 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 5638688 | |
| t | 5610829 | |
| (space) | 2815983 | |
| d | 2812060 | |
| E | 2810737 | |
| j | 2810737 | |
| c | 2810737 | |
| o | 2804961 | |
| N | 2788529 | |
| l | 39017 | 0.1% |
| Other values (14) | 174585 | 0.6% |
EMOTIONAL_STATUS
Categorical
IMBALANCE  MISSING 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2592938 |
| Missing (%) | 47.1% |
| Memory size | 42.0 MiB |
| Does Not Apply | |
|---|---|
| Conscious | |
| Unknown | 14887 |
| Shock | 14452 |
| Semiconscious | 2840 |
| Other values (3) | 6546 |
Length
| Max length | 14 |
|---|---|
| Median length | 14 |
| Mean length | 13.095704 |
| Min length | 5 |
Characters and Unicode
| Total characters | 38171528 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Does Not Apply |
|---|---|
| 2nd row | Does Not Apply |
| 3rd row | Conscious |
| 4th row | Conscious |
| 5th row | Does Not Apply |
Common Values
| Value | Count | Frequency (%) |
| Does Not Apply | 2399462 | |
| Conscious | 476626 | 8.7% |
| Unknown | 14887 | 0.3% |
| Shock | 14452 | 0.3% |
| Semiconscious | 2840 | 0.1% |
| Unconscious | 2705 | < 0.1% |
| Apparent Death | 1968 | < 0.1% |
| Incoherent | 1873 | < 0.1% |
| (Missing) | 2592938 |
Words
| Value | Count | Frequency (%) |
| does | 2399462 | |
| not | 2399462 | |
| apply | 2399462 | |
| conscious | 476626 | 6.2% |
| unknown | 14887 | 0.2% |
| shock | 14452 | 0.2% |
| semiconscious | 2840 | < 0.1% |
| unconscious | 2705 | < 0.1% |
| apparent | 1968 | < 0.1% |
| death | 1968 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 5794478 | |
| p | 4802860 | |
| (space) | 4800892 | |
| s | 3363804 | |
| e | 2409984 | |
| t | 2405271 | |
| D | 2401430 | |
| A | 2401430 | |
| N | 2399462 | |
| l | 2399462 | |
| Other values (15) | 4992455 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 25654931 | |
| Uppercase Letter | 7715705 | 20.2% |
| Space Separator | 4800892 | 12.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 5794478 | |
| p | 4802860 | |
| s | 3363804 | |
| e | 2409984 | |
| t | 2405271 | |
| l | 2399462 | |
| y | 2399462 | |
| n | 535251 | 2.1% |
| c | 504041 | 2.0% |
| i | 485011 | 1.9% |
| Other values (7) | 555307 | 2.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 2401430 | |
| A | 2401430 | |
| N | 2399462 | |
| C | 476626 | 6.2% |
| U | 17592 | 0.2% |
| S | 17292 | 0.2% |
| I | 1873 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4800892 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 33370636 | |
| Common | 4800892 | 12.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 5794478 | |
| p | 4802860 | |
| s | 3363804 | |
| e | 2409984 | |
| t | 2405271 | |
| D | 2401430 | |
| A | 2401430 | |
| N | 2399462 | |
| l | 2399462 | |
| y | 2399462 | |
| Other values (14) | 2592993 |
Common
| Value | Count | Frequency (%) |
| (space) | 4800892 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 38171528 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 5794478 | |
| p | 4802860 | |
| (space) | 4800892 | |
| s | 3363804 | |
| e | 2409984 | |
| t | 2405271 | |
| D | 2401430 | |
| A | 2401430 | |
| N | 2399462 | |
| l | 2399462 | |
| Other values (15) | 4992455 |
BODILY_INJURY
Categorical
IMBALANCE  MISSING 
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2592895 |
| Missing (%) | 47.1% |
| Memory size | 42.0 MiB |
| Does Not Apply | |
|---|---|
| Back | 80790 |
| Neck | 77553 |
| Knee-Lower Leg Foot | 74430 |
| Head | 67113 |
| Other values (9) | 184695 |
Length
| Max length | 20 |
|---|---|
| Median length | 14 |
| Mean length | 13.29725 |
| Min length | 3 |
Characters and Unicode
| Total characters | 38759570 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Does Not Apply |
|---|---|
| 2nd row | Does Not Apply |
| 3rd row | Back |
| 4th row | Shoulder - Upper Arm |
| 5th row | Does Not Apply |
Common Values
| Value | Count | Frequency (%) |
| Does Not Apply | 2430275 | |
| Back | 80790 | 1.5% |
| Neck | 77553 | 1.4% |
| Knee-Lower Leg Foot | 74430 | 1.4% |
| Head | 67113 | 1.2% |
| Entire Body | 39667 | 0.7% |
| Elbow-Lower-Arm-Hand | 33361 | 0.6% |
| Shoulder - Upper Arm | 33301 | 0.6% |
| Unknown | 21110 | 0.4% |
| Chest | 17646 | 0.3% |
| Other values (4) | 39610 | 0.7% |
| (Missing) | 2592895 |
Words
| Value | Count | Frequency (%) |
| does | 2430275 | |
| apply | 2430275 | |
| not | 2430275 | |
| leg | 91599 | 1.1% |
| back | 80790 | 1.0% |
| neck | 77553 | 1.0% |
| knee-lower | 74430 | 0.9% |
| foot | 74430 | 0.9% |
| head | 67113 | 0.8% |
| 41810 | 0.5% | |
| Other values (13) | 299473 | 3.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 5253149 | |
| (space) | 5183167 | |
| p | 4978659 | |
| e | 3095225 | |
| t | 2562018 | |
| N | 2507828 | |
| A | 2505446 | |
| l | 2505446 | |
| y | 2470862 | |
| s | 2456430 | |
| Other values (26) | 5241340 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 25095016 | |
| Uppercase Letter | 8247895 | 21.3% |
| Space Separator | 5183167 | 13.4% |
| Dash Punctuation | 233492 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 5253149 | |
| p | 4978659 | |
| e | 3095225 | |
| t | 2562018 | |
| l | 2505446 | |
| y | 2470862 | |
| s | 2456430 | |
| r | 297891 | 1.2% |
| n | 219297 | 0.9% |
| a | 194276 | 0.8% |
| Other values (11) | 1061763 | 4.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 2507828 | |
| A | 2505446 | |
| D | 2430275 | |
| L | 199390 | 2.4% |
| B | 120457 | 1.5% |
| H | 117643 | 1.4% |
| F | 87442 | 1.1% |
| K | 74430 | 0.9% |
| E | 73948 | 0.9% |
| U | 71580 | 0.9% |
| Other values (3) | 59456 | 0.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5183167 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 233492 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 33342911 | |
| Common | 5416659 | 14.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 5253149 | |
| p | 4978659 | |
| e | 3095225 | |
| t | 2562018 | |
| N | 2507828 | |
| A | 2505446 | |
| l | 2505446 | |
| y | 2470862 | |
| s | 2456430 | |
| D | 2430275 | |
| Other values (24) | 2577573 |
Common
| Value | Count | Frequency (%) |
| (space) | 5183167 | |
| - | 233492 | 4.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 38759570 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 5253149 | |
| (space) | 5183167 | |
| p | 4978659 | |
| e | 3095225 | |
| t | 2562018 | |
| N | 2507828 | |
| A | 2505446 | |
| l | 2505446 | |
| y | 2470862 | |
| s | 2456430 | |
| Other values (26) | 5241340 |
POSITION_IN_VEHICLE
Categorical
IMBALANCE  MISSING 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2678865 |
| Missing (%) | 48.6% |
| Memory size | 42.0 MiB |
| Driver | |
|---|---|
| Front passenger, if two or more persons, including the driver, are in the front seat | |
| Right rear passenger or motorcycle sidecar passenger | 141253 |
| Left rear passenger, or rear passenger on a bicycle, motorcycle, snowmobile | 132441 |
| Any person in the rear of a station wagon, pick-up truck, all passengers on a bus, etc | 76572 |
| Other values (6) | 156140 |
Length
| Max length | 86 |
|---|---|
| Median length | 6 |
| Mean length | 24.567415 |
| Min length | 6 |
Characters and Unicode
| Total characters | 69498415 |
|---|---|
| Distinct characters | 39 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Front passenger, if two or more persons, including the driver, are in the front seat |
|---|---|
| 2nd row | Right rear passenger or motorcycle sidecar passenger |
| 3rd row | Driver |
| 4th row | Driver |
| 5th row | Driver |
Common Values
| Value | Count | Frequency (%) |
| Driver | 1975733 | |
| Front passenger, if two or more persons, including the driver, are in the front seat | 346747 | 6.3% |
| Right rear passenger or motorcycle sidecar passenger | 141253 | 2.6% |
| Left rear passenger, or rear passenger on a bicycle, motorcycle, snowmobile | 132441 | 2.4% |
| Any person in the rear of a station wagon, pick-up truck, all passengers on a bus, etc | 76572 | 1.4% |
| Unknown | 67074 | 1.2% |
| Middle rear seat, or passenger lying across a seat | 42750 | 0.8% |
| Middle front seat, or passenger lying across a seat | 34595 | 0.6% |
| Riding/Hanging on Outside | 7543 | 0.1% |
| Does Not Apply | 3246 | 0.1% |
| (Missing) | 2678865 |
Words
| Value | Count | Frequency (%) |
| driver | 2322480 | |
| passenger | 971480 | 8.3% |
| the | 770066 | 6.6% |
| front | 728089 | 6.2% |
| or | 697786 | 5.9% |
| rear | 525457 | 4.5% |
| seat | 501437 | 4.3% |
| in | 423319 | 3.6% |
| a | 362930 | 3.1% |
| if | 347679 | 3.0% |
| Other values (38) | 4077495 |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 9858274 | |
| (space) | 8899332 | |
| e | 8314925 | |
| i | 4672366 | 6.7% |
| n | 4200842 | 6.0% |
| s | 4042088 | 5.8% |
| o | 3957733 | 5.7% |
| a | 3244208 | 4.7% |
| t | 3212668 | 4.6% |
| v | 2322480 | 3.3% |
| Other values (29) | 16773499 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 55918015 | |
| Space Separator | 8899332 | 12.8% |
| Uppercase Letter | 2850464 | 4.1% |
| Other Punctuation | 1754032 | 2.5% |
| Dash Punctuation | 76572 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 9858274 | |
| e | 8314925 | |
| i | 4672366 | |
| n | 4200842 | |
| s | 4042088 | 7.2% |
| o | 3957733 | 7.1% |
| a | 3244208 | 5.8% |
| t | 3212668 | 5.7% |
| v | 2322480 | 4.2% |
| g | 1712598 | 3.1% |
| Other values (12) | 10379833 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 1978979 | |
| F | 346747 | 12.2% |
| R | 148796 | 5.2% |
| L | 132441 | 4.6% |
| A | 79818 | 2.8% |
| M | 77345 | 2.7% |
| U | 67074 | 2.4% |
| H | 7543 | 0.3% |
| O | 7543 | 0.3% |
| N | 3246 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1744625 | |
| / | 7543 | 0.4% |
| & | 932 | 0.1% |
| ; | 932 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8899332 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 76572 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 58768479 | |
| Common | 10729936 | 15.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 9858274 | |
| e | 8314925 | |
| i | 4672366 | 8.0% |
| n | 4200842 | 7.1% |
| s | 4042088 | 6.9% |
| o | 3957733 | 6.7% |
| a | 3244208 | 5.5% |
| t | 3212668 | 5.5% |
| v | 2322480 | 4.0% |
| D | 1978979 | 3.4% |
| Other values (23) | 12963916 |
Common
| Value | Count | Frequency (%) |
| (space) | 8899332 | |
| , | 1744625 | 16.3% |
| - | 76572 | 0.7% |
| / | 7543 | 0.1% |
| & | 932 | < 0.1% |
| ; | 932 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 69498415 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 9858274 | |
| (space) | 8899332 | 12.8% |
| e | 8314925 | |
| i | 4672366 | 6.7% |
| n | 4200842 | 6.0% |
| s | 4042088 | 5.8% |
| o | 3957733 | 5.7% |
| a | 3244208 | 4.7% |
| t | 3212668 | 4.6% |
| v | 2322480 | 3.3% |
| Other values (29) | 16773499 |
SAFETY_EQUIPMENT
Categorical
IMBALANCE  MISSING 
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2861973 |
| Missing (%) | 52.0% |
| Memory size | 42.0 MiB |
| Lap Belt & Harness | 1685514 |
|---|---|
| Unknown | 445545 |
| Lap Belt | 375064 |
| Child Restraint Only | 45832 |
| Air Bag Deployed/Lap Belt/Harness | 19552 |
| Other values (12) | 74271 |
Length
| Max length | 40 |
|---|---|
| Median length | 18 |
| Mean length | 14.850971 |
| Min length | 1 |
Characters and Unicode
| Total characters | 39292373 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Lap Belt & Harness |
|---|---|
| 2nd row | Lap Belt |
| 3rd row | Lap Belt & Harness |
| 4th row | Lap Belt & Harness |
| 5th row | Lap Belt & Harness |
Common Values
| Value | Count | Frequency (%) |
| Lap Belt & Harness | 1685514 | 30.6% |
| Unknown | 445545 | 8.1% |
| Lap Belt | 375064 | 6.8% |
| Child Restraint Only | 45832 | 0.8% |
| Air Bag Deployed/Lap Belt/Harness | 19552 | 0.4% |
| Other | 15433 | 0.3% |
| Helmet (Motorcycle Only) | 14278 | 0.3% |
| Harness | 12531 | 0.2% |
| Helmet Only (In-Line Skater/Bicyclist) | 10647 | 0.2% |
| - | 7185 | 0.1% |
| Other values (7) | 14197 | 0.3% |
| (Missing) | 2861973 | 52.0% |
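The Frequency (%) column in the Common Values tables appears to be each count divided by the total number of observations (5507751), with missing rows included in the denominator. The following is a minimal sketch under that assumption; `frequency_pct` is a hypothetical helper, not part of the profiling tool.

```python
# Assumption: frequency = count / total observations, rounded to one decimal.
# The helper name is illustrative and not taken from ydata-profiling.
TOTAL_OBSERVATIONS = 5507751  # "Number of observations" from the dataset statistics


def frequency_pct(count, total=TOTAL_OBSERVATIONS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Counts copied from the SAFETY_EQUIPMENT Common Values table.
counts = {
    "Unknown": 445545,
    "Lap Belt": 375064,
    "(Missing)": 2861973,
}

for value, count in counts.items():
    print(f"{value}: {frequency_pct(count)}%")
```

Under this assumption the three sample rows reproduce the 8.1%, 6.8% and 52.0% figures reported for this variable.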
Words
| Value | Count | Frequency (%) |
| belt | 2063714 | 24.8% |
| lap | 2060580 | 24.8% |
| harness | 1698045 | 20.4% |
|  | 1692699 | 20.3% |
| unknown | 445545 | 5.4% |
| only | 71394 | 0.9% |
| restraint | 46400 | 0.6% |
| child | 45832 | 0.6% |
| air | 29927 | 0.4% |
| bag | 29927 | 0.4% |
| Other values (12) | 136871 | 1.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5675156 | 14.4% |
| e | 4025715 | |
| a | 3891748 | |
| s | 3496702 | |
| n | 3200962 | 8.1% |
| l | 2287842 | 5.8% |
| t | 2266554 | 5.8% |
| B | 2127662 | 5.4% |
| p | 2114295 | 5.4% |
| L | 2097735 | 5.3% |
| Other values (27) | 8108002 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 25088939 | |
| Uppercase Letter | 6703162 | 17.1% |
| Space Separator | 5675156 | 14.4% |
| Other Punctuation | 1745974 | 4.4% |
| Open Punctuation | 28745 | 0.1% |
| Close Punctuation | 28745 | 0.1% |
| Dash Punctuation | 21652 | 0.1% |
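The category names in these tables (Lowercase Letter, Space Separator, Open Punctuation, and so on) match the Unicode general categories. As a rough illustration of how such a tally could be produced, the sketch below counts characters by category code using Python's standard `unicodedata` module. It is an assumption about the approach, not the profiler's own code, and `category_counts` is a hypothetical name.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters across all strings by Unicode general category code.

    Codes seen in this report's data include: Ll (Lowercase Letter),
    Lu (Uppercase Letter), Zs (Space Separator), Po (Other Punctuation),
    Ps (Open Punctuation), Pe (Close Punctuation), Pd (Dash Punctuation).
    """
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# A few values taken from the SAFETY_EQUIPMENT column.
sample = ["Lap Belt & Harness", "Helmet (Motorcycle Only)", "-"]
for code, n in category_counts(sample).most_common():
    print(code, n)
```

Run over the full column, a tally of this kind would be expected to yield the seven categories listed for SAFETY_EQUIPMENT.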
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 4025715 | |
| a | 3891748 | |
| s | 3496702 | |
| n | 3200962 | |
| l | 2287842 | |
| t | 2266554 | |
| p | 2114295 | |
| r | 1841837 | |
| o | 504578 | 2.0% |
| k | 460012 | 1.8% |
| Other values (8) | 998694 | 4.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 2127662 | |
| L | 2097735 | |
| H | 1745707 | |
| U | 445545 | 6.6% |
| O | 90010 | 1.3% |
| R | 46400 | 0.7% |
| C | 46400 | 0.7% |
| A | 29927 | 0.4% |
| D | 29927 | 0.4% |
| S | 15017 | 0.2% |
| Other values (3) | 28832 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| & | 1685514 | |
| / | 60460 | 3.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5675156 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 28745 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 28745 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 21652 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 31792101 | |
| Common | 7500272 | 19.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 4025715 | |
| a | 3891748 | |
| s | 3496702 | |
| n | 3200962 | |
| l | 2287842 | |
| t | 2266554 | |
| B | 2127662 | |
| p | 2114295 | |
| L | 2097735 | |
| r | 1841837 | |
| Other values (21) | 4441049 |
Common
| Value | Count | Frequency (%) |
| (space) | 5675156 | 75.7% |
| & | 1685514 | 22.5% |
| / | 60460 | 0.8% |
| ( | 28745 | 0.4% |
| ) | 28745 | 0.4% |
| - | 21652 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 39292373 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5675156 | 14.4% |
| e | 4025715 | |
| a | 3891748 | |
| s | 3496702 | |
| n | 3200962 | 8.1% |
| l | 2287842 | 5.8% |
| t | 2266554 | 5.8% |
| B | 2127662 | 5.4% |
| p | 2114295 | 5.4% |
| L | 2097735 | 5.3% |
| Other values (27) | 8108002 |
PED_LOCATION
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 5416437 |
| Missing (%) | 98.3% |
| Memory size | 42.0 MiB |
| Pedestrian/Bicyclist/Other Pedestrian at Intersection | 55506 |
|---|---|
| Pedestrian/Bicyclist/Other Pedestrian Not at Intersection | 29588 |
| Does Not Apply | 3638 |
| Unknown | 2582 |
Length
| Max length | 57 |
|---|---|
| Median length | 53 |
| Mean length | 51.441619 |
| Min length | 7 |
Characters and Unicode
| Total characters | 4697340 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Pedestrian/Bicyclist/Other Pedestrian at Intersection |
|---|---|
| 2nd row | Pedestrian/Bicyclist/Other Pedestrian at Intersection |
| 3rd row | Pedestrian/Bicyclist/Other Pedestrian Not at Intersection |
| 4th row | Pedestrian/Bicyclist/Other Pedestrian at Intersection |
| 5th row | Pedestrian/Bicyclist/Other Pedestrian at Intersection |
Common Values
| Value | Count | Frequency (%) |
| Pedestrian/Bicyclist/Other Pedestrian at Intersection | 55506 | 1.0% |
| Pedestrian/Bicyclist/Other Pedestrian Not at Intersection | 29588 | 0.5% |
| Does Not Apply | 3638 | 0.1% |
| Unknown | 2582 | < 0.1% |
| (Missing) | 5416437 | 98.3% |
Words
| Value | Count | Frequency (%) |
| pedestrian/bicyclist/other | 85094 | |
| pedestrian | 85094 | |
| at | 85094 | |
| intersection | 85094 | |
| not | 33226 | 8.7% |
| does | 3638 | 0.9% |
| apply | 3638 | 0.9% |
| unknown | 2582 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 628884 | |
| e | 599296 | |
| i | 425470 | |
| n | 348122 | 7.4% |
| s | 344014 | 7.3% |
| r | 340376 | 7.2% |
| (space) | 292146 | 6.2% |
| a | 255282 | 5.4% |
| c | 255282 | 5.4% |
| P | 170188 | 3.6% |
| Other values (16) | 1038280 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3766452 | |
| Uppercase Letter | 468554 | 10.0% |
| Space Separator | 292146 | 6.2% |
| Other Punctuation | 170188 | 3.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 628884 | |
| e | 599296 | |
| i | 425470 | |
| n | 348122 | |
| s | 344014 | |
| r | 340376 | |
| a | 255282 | |
| c | 255282 | |
| d | 170188 | 4.5% |
| o | 124540 | 3.3% |
| Other values (6) | 274998 |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 170188 | |
| O | 85094 | |
| I | 85094 | |
| B | 85094 | |
| N | 33226 | 7.1% |
| D | 3638 | 0.8% |
| A | 3638 | 0.8% |
| U | 2582 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 292146 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 170188 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4235006 | |
| Common | 462334 | 9.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 628884 | |
| e | 599296 | |
| i | 425470 | |
| n | 348122 | |
| s | 344014 | |
| r | 340376 | |
| a | 255282 | 6.0% |
| c | 255282 | 6.0% |
| P | 170188 | 4.0% |
| d | 170188 | 4.0% |
| Other values (14) | 697904 |
Common
| Value | Count | Frequency (%) |
| (space) | 292146 | 63.2% |
| / | 170188 | 36.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4697340 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 628884 | |
| e | 599296 | |
| i | 425470 | |
| n | 348122 | 7.4% |
| s | 344014 | 7.3% |
| r | 340376 | 7.2% |
| (space) | 292146 | 6.2% |
| a | 255282 | 5.4% |
| c | 255282 | 5.4% |
| P | 170188 | 3.6% |
| Other values (16) | 1038280 |
PED_ACTION
Categorical
MISSING 
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 5416538 |
| Missing (%) | 98.3% |
| Memory size | 42.0 MiB |
| Crossing With Signal | 34140 |
|---|---|
| Crossing, No Signal, or Crosswalk | 15484 |
| Crossing, No Signal, Marked Crosswalk | 7819 |
| Other Actions in Roadway | 7153 |
| Crossing Against Signal | 6347 |
| Other values (11) | 20270 |
Length
| Max length | 47 |
|---|---|
| Median length | 44 |
| Mean length | 24.457424 |
| Min length | 7 |
Characters and Unicode
| Total characters | 2230835 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Crossing With Signal |
|---|---|
| 2nd row | Crossing With Signal |
| 3rd row | Crossing, No Signal, or Crosswalk |
| 4th row | Crossing With Signal |
| 5th row | Crossing With Signal |
Common Values
| Value | Count | Frequency (%) |
| Crossing With Signal | 34140 | 0.6% |
| Crossing, No Signal, or Crosswalk | 15484 | 0.3% |
| Crossing, No Signal, Marked Crosswalk | 7819 | 0.1% |
| Other Actions in Roadway | 7153 | 0.1% |
| Crossing Against Signal | 6347 | 0.1% |
| Unknown | 4356 | 0.1% |
| Not in Roadway | 4313 | 0.1% |
| Does Not Apply | 4058 | 0.1% |
| Emerging from in Front of/Behind Parked Vehicle | 2864 | 0.1% |
| Working in Roadway | 1387 | < 0.1% |
| Other values (6) | 3292 | 0.1% |
| (Missing) | 5416538 | 98.3% |
Words
| Value | Count | Frequency (%) |
| crossing | 63790 | |
| signal | 63790 | |
| with | 35057 | |
| crosswalk | 23303 | 6.9% |
| no | 23303 | 6.9% |
| in | 16229 | 4.8% |
| or | 15484 | 4.6% |
| roadway | 13365 | 4.0% |
| other | 8409 | 2.5% |
| not | 8371 | 2.5% |
| Other values (29) | 66655 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 246543 | 11.1% |
| i | 212305 | 9.5% |
| s | 193607 | 8.7% |
| n | 189361 | 8.5% |
| o | 177802 | 8.0% |
| g | 148526 | 6.7% |
| a | 136813 | 6.1% |
| r | 133401 | 6.0% |
| l | 99554 | 4.5% |
| C | 87322 | 3.9% |
| Other values (31) | 605601 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1626387 | |
| Uppercase Letter | 305655 | 13.7% |
| Space Separator | 246543 | 11.1% |
| Other Punctuation | 52250 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 212305 | |
| s | 193607 | |
| n | 189361 | |
| o | 177802 | |
| g | 148526 | |
| a | 136813 | |
| r | 133401 | |
| l | 99554 | |
| t | 71166 | 4.4% |
| h | 54486 | 3.4% |
| Other values (10) | 209366 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 87322 | |
| S | 65196 | |
| W | 37893 | |
| N | 31674 | 10.4% |
| A | 19081 | 6.2% |
| R | 14585 | 4.8% |
| O | 10921 | 3.6% |
| M | 7819 | 2.6% |
| U | 4356 | 1.4% |
| B | 4195 | 1.4% |
| Other values (8) | 22613 | 7.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 46606 | |
| / | 5644 | 10.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 246543 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1932042 | |
| Common | 298793 | 13.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 212305 | |
| s | 193607 | |
| n | 189361 | 9.8% |
| o | 177802 | 9.2% |
| g | 148526 | 7.7% |
| a | 136813 | 7.1% |
| r | 133401 | 6.9% |
| l | 99554 | 5.2% |
| C | 87322 | 4.5% |
| t | 71166 | 3.7% |
| Other values (28) | 482185 |
Common
| Value | Count | Frequency (%) |
| (space) | 246543 | 82.5% |
| , | 46606 | 15.6% |
| / | 5644 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2230835 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 246543 | 11.1% |
| i | 212305 | 9.5% |
| s | 193607 | 8.7% |
| n | 189361 | 8.5% |
| o | 177802 | 8.0% |
| g | 148526 | 6.7% |
| a | 136813 | 6.1% |
| r | 133401 | 6.0% |
| l | 99554 | 4.5% |
| C | 87322 | 3.9% |
| Other values (31) | 605601 |
COMPLAINT
Categorical
IMBALANCE  MISSING 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2592888 |
| Missing (%) | 47.1% |
| Memory size | 42.0 MiB |
| Does Not Apply | 2431110 |
|---|---|
| Complaint of Pain or Nausea | 216456 |
| Complaint of Pain | 88495 |
| None Visible | 49508 |
| Minor Bleeding | 26386 |
| Other values (16) | 102908 |
Length
| Max length | 34 |
|---|---|
| Median length | 14 |
| Mean length | 14.956543 |
| Min length | 7 |
Characters and Unicode
| Total characters | 43596274 |
|---|---|
| Distinct characters | 39 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Does Not Apply |
|---|---|
| 2nd row | Does Not Apply |
| 3rd row | Complaint of Pain or Nausea |
| 4th row | None Visible |
| 5th row | Does Not Apply |
Common Values
| Value | Count | Frequency (%) |
| Does Not Apply | 2431110 | 44.1% |
| Complaint of Pain or Nausea | 216456 | 3.9% |
| Complaint of Pain | 88495 | 1.6% |
| None Visible | 49508 | 0.9% |
| Minor Bleeding | 26386 | 0.5% |
| Contusion - Bruise | 20870 | 0.4% |
| Unknown | 20454 | 0.4% |
| Whiplash | 19846 | 0.4% |
| Abrasion | 15156 | 0.3% |
| Internal | 7809 | 0.1% |
| Other values (11) | 18773 | 0.3% |
| (Missing) | 2592888 | 47.1% |
Words
| Value | Count | Frequency (%) |
| does | 2431110 | |
| not | 2431110 | |
| apply | 2431110 | |
| complaint | 304951 | 3.4% |
| of | 304951 | 3.4% |
| pain | 304951 | 3.4% |
| or | 216456 | 2.4% |
| nausea | 216456 | 2.4% |
| none | 49508 | 0.6% |
| visible | 49508 | 0.6% |
| Other values (21) | 232655 | 2.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 6057903 | 13.9% |
| o | 5873572 | |
| p | 5187177 | |
| e | 2863975 | 6.6% |
| l | 2850573 | 6.5% |
| s | 2798861 | 6.4% |
| t | 2794582 | 6.4% |
| N | 2697074 | 6.2% |
| A | 2446426 | 5.6% |
| D | 2445147 | 5.6% |
| Other values (29) | 7580984 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 29087012 | |
| Uppercase Letter | 8416452 | 19.3% |
| Space Separator | 6057903 | 13.9% |
| Dash Punctuation | 34907 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 5873572 | |
| p | 5187177 | |
| e | 2863975 | |
| l | 2850573 | |
| s | 2798861 | |
| t | 2794582 | |
| y | 2431142 | |
| a | 1104465 | 3.8% |
| i | 870913 | 3.0% |
| n | 869484 | 3.0% |
| Other values (13) | 1442268 | 5.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 2697074 | |
| A | 2446426 | |
| D | 2445147 | |
| C | 330678 | 3.9% |
| P | 304983 | 3.6% |
| B | 51781 | 0.6% |
| V | 49508 | 0.6% |
| M | 27838 | 0.3% |
| U | 20454 | 0.2% |
| W | 19846 | 0.2% |
| Other values (4) | 22717 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6057903 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 34907 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 37503464 | |
| Common | 6092810 | 14.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 5873572 | |
| p | 5187177 | |
| e | 2863975 | |
| l | 2850573 | |
| s | 2798861 | |
| t | 2794582 | |
| N | 2697074 | |
| A | 2446426 | |
| D | 2445147 | |
| y | 2431142 | |
| Other values (27) | 5114935 |
Common
| Value | Count | Frequency (%) |
| (space) | 6057903 | 99.4% |
| - | 34907 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 43596274 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 6057903 | 13.9% |
| o | 5873572 | |
| p | 5187177 | |
| e | 2863975 | 6.6% |
| l | 2850573 | 6.5% |
| s | 2798861 | 6.4% |
| t | 2794582 | 6.4% |
| N | 2697074 | 6.2% |
| A | 2446426 | 5.6% |
| D | 2445147 | 5.6% |
| Other values (29) | 7580984 |
PED_ROLE
Categorical
MISSING 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 194889 |
| Missing (%) | 3.5% |
| Memory size | 42.0 MiB |
| Registrant | 2283748 |
|---|---|
| Driver | 2022958 |
| Passenger | 800425 |
| Pedestrian | 89617 |
| Witness | 74794 |
| Other values (5) | 41320 |
Length
| Max length | 15 |
|---|---|
| Median length | 14 |
| Mean length | 8.2660745 |
| Min length | 5 |
Characters and Unicode
| Total characters | 43916513 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Registrant |
|---|---|
| 2nd row | Passenger |
| 3rd row | Registrant |
| 4th row | Notified Person |
| 5th row | Passenger |
Common Values
| Value | Count | Frequency (%) |
| Registrant | 2283748 | 41.5% |
| Driver | 2022958 | 36.7% |
| Passenger | 800425 | 14.5% |
| Pedestrian | 89617 | 1.6% |
| Witness | 74794 | 1.4% |
| Owner | 27910 | 0.5% |
| Notified Person | 8841 | 0.2% |
| Policy Holder | 2415 | < 0.1% |
| Other | 1776 | < 0.1% |
| In-Line Skater | 378 | < 0.1% |
| (Missing) | 194889 | 3.5% |
Words
| Value | Count | Frequency (%) |
| registrant | 2283748 | |
| driver | 2022958 | |
| passenger | 800425 | 15.0% |
| pedestrian | 89617 | 1.7% |
| witness | 74794 | 1.4% |
| owner | 27910 | 0.5% |
| notified | 8841 | 0.2% |
| person | 8841 | 0.2% |
| policy | 2415 | < 0.1% |
| holder | 2415 | < 0.1% |
| Other values (3) | 2532 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 7261026 | |
| e | 6212123 | |
| t | 4742902 | |
| i | 4491592 | |
| s | 4132644 | |
| n | 3286091 | |
| a | 3174168 | |
| g | 3084173 | |
| R | 2283748 | 5.2% |
| D | 2022958 | 4.6% |
| Other values (20) | 3225088 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 38579627 | |
| Uppercase Letter | 5324874 | 12.1% |
| Space Separator | 11634 | < 0.1% |
| Dash Punctuation | 378 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 7261026 | |
| e | 6212123 | |
| t | 4742902 | |
| i | 4491592 | |
| s | 4132644 | |
| n | 3286091 | |
| a | 3174168 | |
| g | 3084173 | |
| v | 2022958 | 5.2% |
| d | 100873 | 0.3% |
| Other values (8) | 71077 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 2283748 | |
| D | 2022958 | |
| P | 901298 | 16.9% |
| W | 74794 | 1.4% |
| O | 29686 | 0.6% |
| N | 8841 | 0.2% |
| H | 2415 | < 0.1% |
| I | 378 | < 0.1% |
| L | 378 | < 0.1% |
| S | 378 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 11634 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 378 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 43904501 | |
| Common | 12012 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 7261026 | |
| e | 6212123 | |
| t | 4742902 | |
| i | 4491592 | |
| s | 4132644 | |
| n | 3286091 | |
| a | 3174168 | |
| g | 3084173 | |
| R | 2283748 | 5.2% |
| D | 2022958 | 4.6% |
| Other values (18) | 3213076 |
Common
| Value | Count | Frequency (%) |
| (space) | 11634 | 96.9% |
| - | 378 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 43916513 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 7261026 | |
| e | 6212123 | |
| t | 4742902 | |
| i | 4491592 | |
| s | 4132644 | |
| n | 3286091 | |
| a | 3174168 | |
| g | 3084173 | |
| R | 2283748 | 5.2% |
| D | 2022958 | 4.6% |
| Other values (20) | 3225088 |
CONTRIBUTING_FACTOR_1
Text
MISSING 
| Distinct | 53 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 5417789 |
| Missing (%) | 98.4% |
| Memory size | 42.0 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 11 |
| Mean length | 19.59404 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1762719 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Unspecified |
|---|---|
| 2nd row | Unspecified |
| 3rd row | Unspecified |
| 4th row | Unspecified |
| 5th row | Unspecified |
| Value | Count | Frequency (%) |
| unspecified | 62863 | |
| error/confusion | 14199 | 10.2% |
| pedestrian/bicyclist/other | 14199 | 10.2% |
| pedestrian | 14199 | 10.2% |
| driver | 3403 | 2.4% |
| inattention/distraction | 3317 | 2.4% |
| to | 2433 | 1.7% |
| failure | 2371 | 1.7% |
| yield | 2347 | 1.7% |
| right-of-way | 2347 | 1.7% |
| Other values (90) | 17486 | 12.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 226607 | |
| e | 223701 | |
| n | 141140 | 8.0% |
| s | 128644 | 7.3% |
| r | 109046 | 6.2% |
| c | 100279 | 5.7% |
| d | 99107 | 5.6% |
| t | 85246 | 4.8% |
| f | 82933 | 4.7% |
| p | 64088 | 3.6% |
| Other values (42) | 501928 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1475172 | |
| Uppercase Letter | 185649 | 10.5% |
| Space Separator | 49202 | 2.8% |
| Other Punctuation | 46852 | 2.7% |
| Dash Punctuation | 5058 | 0.3% |
| Close Punctuation | 393 | < 0.1% |
| Open Punctuation | 393 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 226607 | |
| e | 223701 | |
| n | 141140 | |
| s | 128644 | |
| r | 109046 | |
| c | 100279 | |
| d | 99107 | |
| t | 85246 | 5.8% |
| f | 82933 | 5.6% |
| p | 64088 | 4.3% |
| Other values (15) | 214381 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 63662 | |
| P | 29546 | |
| C | 16471 | 8.9% |
| O | 16121 | 8.7% |
| B | 14463 | 7.8% |
| E | 14229 | 7.7% |
| D | 8960 | 4.8% |
| I | 5081 | 2.7% |
| R | 2795 | 1.5% |
| F | 2520 | 1.4% |
| Other values (12) | 11801 | 6.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 49202 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 46852 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5058 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 393 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 393 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1660821 | |
| Common | 101898 | 5.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 226607 | |
| e | 223701 | |
| n | 141140 | 8.5% |
| s | 128644 | 7.7% |
| r | 109046 | 6.6% |
| c | 100279 | 6.0% |
| d | 99107 | 6.0% |
| t | 85246 | 5.1% |
| f | 82933 | 5.0% |
| p | 64088 | 3.9% |
| Other values (37) | 400030 |
Common
| Value | Count | Frequency (%) |
| (space) | 49202 | 48.3% |
| / | 46852 | 46.0% |
| - | 5058 | 5.0% |
| ) | 393 | 0.4% |
| ( | 393 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1762719 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 226607 | |
| e | 223701 | |
| n | 141140 | 8.0% |
| s | 128644 | 7.3% |
| r | 109046 | 6.2% |
| c | 100279 | 5.7% |
| d | 99107 | 5.6% |
| t | 85246 | 4.8% |
| f | 82933 | 4.7% |
| p | 64088 | 3.6% |
| Other values (42) | 501928 |
CONTRIBUTING_FACTOR_2
Text
MISSING 
| Distinct | 51 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 5417906 |
| Missing (%) | 98.4% |
| Memory size | 42.0 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 11 |
| Mean length | 13.861194 |
| Min length | 5 |
Characters and Unicode
| Total characters | 1245359 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Unspecified |
|---|---|
| 2nd row | Unspecified |
| 3rd row | Unspecified |
| 4th row | Unspecified |
| 5th row | Unspecified |
| Value | Count | Frequency (%) |
| unspecified | 79006 | |
| pedestrian/bicyclist/other | 3940 | 3.6% |
| pedestrian | 3940 | 3.6% |
| error/confusion | 3940 | 3.6% |
| driver | 1447 | 1.3% |
| inattention/distraction | 1314 | 1.2% |
| to | 1291 | 1.2% |
| failure | 1258 | 1.2% |
| yield | 1235 | 1.1% |
| right-of-way | 1235 | 1.1% |
| Other values (84) | 10646 | 9.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 193095 | |
| e | 192459 | |
| n | 104694 | |
| s | 99760 | |
| d | 91909 | |
| c | 91446 | |
| f | 86663 | |
| p | 80019 | |
| U | 79513 | |
| r | 36741 | 3.0% |
| Other values (42) | 189060 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1086410 | |
| Uppercase Letter | 122747 | 9.9% |
| Space Separator | 19407 | 1.6% |
| Other Punctuation | 13745 | 1.1% |
| Dash Punctuation | 2650 | 0.2% |
| Open Punctuation | 200 | < 0.1% |
| Close Punctuation | 200 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 193095 | |
| e | 192459 | |
| n | 104694 | |
| s | 99760 | |
| d | 91909 | |
| c | 91446 | |
| f | 86663 | |
| p | 80019 | |
| r | 36741 | 3.4% |
| t | 29131 | 2.7% |
| Other values (15) | 80493 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 79513 | |
| P | 8560 | 7.0% |
| C | 5539 | 4.5% |
| O | 5099 | 4.2% |
| D | 4252 | 3.5% |
| B | 4044 | 3.3% |
| E | 3966 | 3.2% |
| I | 2178 | 1.8% |
| T | 1429 | 1.2% |
| R | 1423 | 1.2% |
| Other values (12) | 6744 | 5.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 19407 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 13745 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2650 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 200 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 200 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1209157 | |
| Common | 36202 | 2.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 193095 | |
| e | 192459 | |
| n | 104694 | |
| s | 99760 | |
| d | 91909 | |
| c | 91446 | |
| f | 86663 | |
| p | 80019 | |
| U | 79513 | |
| r | 36741 | 3.0% |
| Other values (37) | 152858 |
Common
| Value | Count | Frequency (%) |
| (space) | 19407 | 53.6% |
| / | 13745 | 38.0% |
| - | 2650 | 7.3% |
| ( | 200 | 0.6% |
| ) | 200 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1245359 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 193095 | |
| e | 192459 | |
| n | 104694 | |
| s | 99760 | |
| d | 91909 | |
| c | 91446 | |
| f | 86663 | |
| p | 80019 | |
| U | 79513 | |
| r | 36741 | 3.0% |
| Other values (42) | 189060 |
PERSON_SEX
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 614713 |
| Missing (%) | 11.2% |
| Memory size | 42.0 MiB |
| M | 2969786 |
|---|---|
| F | 1490172 |
| U | 433080 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 4893038 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | U |
|---|---|
| 2nd row | F |
| 3rd row | M |
| 4th row | F |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| M | 2969786 | 53.9% |
| F | 1490172 | 27.1% |
| U | 433080 | 7.9% |
| (Missing) | 614713 | 11.2% |
Words
| Value | Count | Frequency (%) |
| m | 2969786 | 60.7% |
| f | 1490172 | 30.5% |
| u | 433080 | 8.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 2969786 | 60.7% |
| F | 1490172 | 30.5% |
| U | 433080 | 8.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 4893038 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 2969786 | 60.7% |
| F | 1490172 | 30.5% |
| U | 433080 | 8.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4893038 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 2969786 | 60.7% |
| F | 1490172 | 30.5% |
| U | 433080 | 8.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4893038 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 2969786 | 60.7% |
| F | 1490172 | 30.5% |
| U | 433080 | 8.9% |